Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Supersonic-based parallel group-by aggregation
ZHANG Bing, SUN Hui, FAN Xu, LI Cuiping, CHEN Hong, WANG Wen
Journal of Computer Applications    2016, 36 (1): 13-20.   DOI: 10.11772/j.issn.1001-9081.2016.01.0013
Abstract502)      PDF (1253KB)(330)       Save
To solve the time-consuming problem of group-by aggregation operation in case of data-intense computation, a cache-friendly group-by aggregation method was proposed. In this paper, the group-by aggregation operation was optimized in two aspects. Firstly, designing cache-friendly group-by aggregation algorithm on Supersonic, an open-source and column-oriented query execution engine, to take the full advantage of column-storage on in-memory computation. Secondly, rewriting the algorithm with multi-threads to speed up the query. In this paper, four different parallel aggregation algorithms were put forward, respectively named Shared-Nothing Parallel Group-by Aggregation (NSHPGA) algorithm, Table-Lock Shared-Hash Parallel Group-by Aggregation (TLSHPGA) algorithm, Bucket-Lock Shared-Hash Parallel Group-by Aggregation (BLSHPGA) algorithm and Node-Lock Shared-Hash Parallel Group-by Aggregation (NLSHPGA) algorithm. Through a series of comparison experiment on different group power set and different number of worker threads, NLSHPGA algorithm was proved to have the best performance both on speed-up ratio and concurrency, which achieved 10x speedups on part of queries. Besides, considering Cache miss and memory utilization, the results shows that NSHPGA algorithm is suitable for smaller group power set, which was 8 in the experiment, and when getting larger, NLSHPGA algorithm performs better than NSHPGA algorithm.
Reference | Related Articles | Metrics